What's going to happen with AI? How worried should you be? What can be done about potential problems?
current status
--:-:-:-:-----------------------------------:-:-:-:--
large language models
The term "large language model" (LLM) generally refers to the following system:
1) A tokenizer that converts text into tokens, which are then embedded as vectors.
2) A
Transformer
architecture that's large, meaning >10^9 parameters.
3) Training on a
proportionally large amount of text (see "Chinchilla scaling laws") to
predict the next token at each point.
There are variations, such as
predicting words in the middle of a sentence, or predicting things other
than words. For example, it's possible to represent each position of a 3d
model of a human with a token, then train an LLM system to predict the next
position in animations.
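To make the next-token objective concrete, here's a minimal training-step sketch in PyTorch. Everything here (the tiny model, the random "tokens") is a placeholder for illustration, not how any production LLM is actually built:

```python
import torch
import torch.nn as nn

# Toy setup: a small vocabulary, a small Transformer, and a batch of
# already-tokenized "text". Real systems use a learned tokenizer (e.g. BPE)
# and billions of parameters; the training objective is the same.
vocab_size, d_model, seq_len = 1000, 64, 32

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)
        # causal mask: each position may only attend to earlier positions
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        x = self.blocks(x, mask=mask)
        return self.head(x)  # logits over the next token at each position

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

tokens = torch.randint(0, vocab_size, (8, seq_len))  # stand-in for tokenized text
logits = model(tokens[:, :-1])                       # predict token t+1 from tokens <= t
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
loss.backward()
opt.step()
```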
Typically, such models are trained on bulk data from the internet, then
fine-tuned on text where people helpfully follow instructions, then there's
some training with human feedback.
The latest and greatest LLM is
GPT-4, and its release was why some people asked me to write this post.
Here's the paper on
GPT-4 from OpenAI. It shows GPT-4 doing well on most professional and
college exams. Here's a paper
showing some more capabilities, and demonstrating that GPT-4 generates a
world model to some extent. (Smaller such models, trained on descriptions of
Othello games, do seem to
have an internal representation of the board.) There are
some
questions about whether the test questions (or something
close
enough) are in the training data somewhere, but its performance is
impressive regardless.
Despite some
objections, there are now various
plugins for GPT-4,
allowing it to query other resources (such as
Wolfram Alpha) or do actions like ordering food.
The impression
that I have of recent LLMs is that they're like a young child that's
memorized the entire internet. They can look up similar text and interpolate
between close matches, but the actual non-lookup reasoning - while it does exist -
seems to be on the level of putting blocks in the
right hole.
You could argue that "just
pattern-matching words according to provided templates isn't real thinking"
- but that's what the median college student is doing in their classes. If
you want to argue that the capabilities of GPT-4 aren't meaningful, you
pretty much have to argue that the skills indicated by most university
degrees aren't meaningful either. Which is...certainly a bullet some people
are willing to bite.
I don't know what's the matter with people: they don't learn by understanding, they learn by some other way — by rote or something. Their knowledge is so fragile!
- Richard Feynman
image generation
AI art is
pretty good now.
Here are some images made by reddit users. These capabilities are now
being built into commercial software in an easy-to-use way, like
Adobe Firefly. For
whatever it's worth, I did predict that "prompted infill" and "conversion to
depth maps and back" would be useful techniques.
Thanks to the
LoRA technique, it's
possible to fine-tune general-purpose models on a normal computer so they
generate specific styles. CivitAI has a
library of tuned models.
LoHa and LoCon are some proposed improvements.
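For what it's worth, the core trick behind LoRA fits in a few lines. Here's a rough sketch (not the code of any particular library): freeze the pretrained weight matrix and learn a low-rank correction on top of it.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer with a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay fixed
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # W x + scale * (B A) x : only A and B get trained, which is a tiny
        # fraction of the model, so fine-tuning fits on a normal computer.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
```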
current techniques
A Transformer is basically a
convolutional neural network (CNN) where the filter weights at each point
are generated by a neural network instead of being fixed.
When you
consider things that way, it's obvious how a Transformer can work for image
recognition by acting similarly to a CNN - if perhaps not as efficiently. In
practice, you need to break images into patches for that to be practical, in
which case you can also just use a
NN on the patches directly.
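To make the "generated filter" framing concrete, here's a toy single-head self-attention written to emphasize that the mixing weights over positions are computed from the input, rather than being fixed learned filters as in a CNN:

```python
import torch
import torch.nn as nn

class OneHeadAttention(nn.Module):
    """Toy self-attention, written to highlight the 'generated filter' view:
    the mixing weights over positions are computed from the input itself,
    unlike a CNN where the filter is a fixed learned tensor."""
    def __init__(self, d):
        super().__init__()
        self.q = nn.Linear(d, d)
        self.k = nn.Linear(d, d)
        self.v = nn.Linear(d, d)

    def forward(self, x):                      # x: (batch, positions, d)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # These per-position weights play the role of a convolution filter,
        # but they depend on x instead of being constants.
        weights = torch.softmax(q @ k.transpose(1, 2) / x.shape[-1] ** 0.5, dim=-1)
        return weights @ v

attn = OneHeadAttention(64)
y = attn(torch.randn(2, 16, 64))               # output shape: (2, 16, 64)
```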
The idea is simple, so why have Transformers only become popular recently?
-
Transformers only work well when using a lot of compute.
- Transformers
are generally hard to train. Most people use layernorm or batchnorm with
something like an Adam optimizer.
The Adam optimizer was published in 2014. Batchnorm was published in 2015. The Transformer design was published in 2017. GPT-1 was released in 2018, then scaled up by 1000x in 2 years.
--:-:-:-:-----------------------------------:-:-:-:--
dystopian government surveillance
AI could use face tracking to automatically figure out
everyone's movements and find potentially subversive citizens. We don't want
things to end up like the book 1984, so maybe we shouldn't develop that.
Oops,
too late.
autonomous military robots
Autonomous killer robots have several potential issues, including:
- They lower the cost of assassinations, including the risk of being traced.
-
Governments that don't need human soldiers don't need to worry about going
so far that their own soldiers refuse orders or turn on them.
- They
could remove people from cities without destroying infrastructure, like a
neutron bomb without the residual radiation, making conquest more valuable.
So, maybe people
shouldn't make them?
Oops,
too
late.
more spam
I've gotten
paying work from chains of events including cold emails. If some AI system
can automatically generate personalized emails, that avenue will be closed.
If scammers can carry on conversations with people more cheaply, then
they'll do more of that. Also, when spam websites become harder to
distinguish from legitimate ones, search results get worse. Maybe AI systems
that can generate superficially realistic writing at low cost won't be a
good thing.
Oops,
too
late.
fake news
"Those photos of
concentration camps and summary executions? Just more deepfake propaganda,
nothing to be concerned about. We'll find the perpetrators soon."
Governments -
yes,
even
democracies - will lie right up to the point where they
occasionally get contradicted by hard evidence. If you shift that point by
making it impossible to tell whether photos are real or generated, then
governments and politicians and corporations will adjust accordingly. So
maybe that's bad.
Oops,
too late.
homework
Students can now
use ChatGPT to generate essays and otherwise do their homework for them.
This could, like, ruin the educational system, because the students aren't
actually getting all that education.
oh no
social problems
While their
military is notoriously weak, the Amish are in many ways better off than the
average American. Clearly, not all technology is socially beneficial. Some
people strongly disagree, but those people should go visit Nevada and watch
people play video poker for a few hours.
People today argue about
whether Facebook has been a net negative for society. Not only do I think it
empirically is, but I think broadcast and cable television have been net
negatives for American society - bigger ones, because Americans watch
a lot of
television. Television, Facebook, slot machines, gacha games, League of
Legends - I don't use any of those, and you shouldn't either. But the
Chinese government has a more extreme position than mine: video games
require ID and minors are limited to
1 hour a day for 3 days a week.
Now, there are fancy machine
learning systems designed to maximize people's engagement and sell them
things. Some people are more prone than others to getting addicted to things
like TikTok and YouTube Shorts, but the better those systems get, and the more
types of addictive things get developed, the more people end up addicted to
something bad for them.
technological unemployment
A lot of technology has been developed, and yet, people still have jobs. Why would a little bit more technology change that?
I've seen that argument many times, and I have 2 responses.
1) Adjustment isn't smooth.
What happened when automated looms were developed? A substantial fraction of
the population worked as weavers, and over 30 years their wages fell roughly
tenfold, from enough to raise a family to not enough to feed themselves.
There was mass poverty and mass starvation. Riots and the destruction of
automated looms were the response, and the government mobilized troops and
carried out public executions.
In ancient Greece, Pericles responded to
technological unemployment with public works programs including the
Parthenon. The workhouses of Victorian London were a far
crueler
response.
2) horses
In 1880,
practical steam engines had been invented a century earlier, yet the US
horse population was rising rapidly. And then, from 1920 to 1960, the US
horse population fell 10-fold. Technology didn't replace horses, until it
did, and then it replaced them hard.
People are more versatile than
horses, but there's always some fraction of the population that can't earn
enough to pay for food and housing, and technological advances can increase
that fraction greatly.
I suppose you could argue that things are different from (1) and (2) in that many of the affected are voters in a democracy, and can vote for the government to distribute the economic gains enough to actually reduce poverty. But consider that the minimum wage of the US in 1950, adjusted by growth of GDP per capita, would be more than the median US wage today.
specific threatened jobs
What jobs are likely to be replaced by AI in the near-term?
2d artists
Some companies are already exclusively using AI art for concept art. For
a lot of uses, companies want copyright ownership, and the US government has
said that AI art generated from just a prompt isn't copyrightable, but
according to their reasoning, a couple stick figures and some prompted
infilling would be enough for copyright.
Some people have said things
like "Artists are still needed because AI can't draw hands" but systems have
already gotten a lot better at that. Also, such problems can be solved with
workflows like:
1) pose a 3d
model
2) get a depth map
3) generate art matching that depth map
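Step (3) of that workflow is already easy with public tools. Here's a rough sketch using the diffusers library's depth-conditioned ControlNet; the checkpoint names are the commonly used public ones, the file names are made up, and treat the whole thing as illustrative rather than a recommended pipeline:

```python
# Rough sketch of step (3): generate an image that matches a depth map,
# using the diffusers library's ControlNet support.
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

depth_map = Image.open("pose_depth.png")   # rendered from the posed 3d model (made-up file)
image = pipe("character concept art, correct hands",
             image=depth_map, num_inference_steps=30).images[0]
image.save("concept.png")
```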
Some people have said things like "This will just be another tool for artists to use, like Photoshop." I don't think so:
- The
skillset for prompt design and inpainting is very different from the
skillset for drawing.
- Fewer people are needed, and I don't expect
demand for art from professional artists to increase proportionally.
-
Finding, hiring, and communicating with an artist could be harder than using
an AI art system yourself.
3d artists
Current AI art systems are much better at 2d art than 3d modeling, but I
expect 3d model generation to improve a lot. Here's an obvious possible
approach being pursued now:
1) 3d model
generation using a
hypernetwork that generates weights for a 3d SDF
2) conversion to a
polygon model with some isosurface algorithm like Surface Nets
3) render
a depth map from some perspective
4) generate a 2d image consistent with
that depth map
5) project the generated image onto the model as a texture
6) repeat steps (3-5) from different angles, doing infill from the visible
textured areas
I do like 3d implicit SDF representations - that's what I use. I thought of that pipeline a while back, and now that every part has been published I don't have any compunctions about discussing it.
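Step (2) of that pipeline, at least, is completely routine already. Here's a minimal sketch with a sphere SDF standing in for whatever a hypernetwork would generate, and scikit-image's marching cubes standing in for Surface Nets (either isosurface method works here):

```python
import numpy as np
from skimage.measure import marching_cubes

# Step (2) of the pipeline above: turn an SDF sampled on a grid into a polygon
# mesh. A sphere SDF is the stand-in for a generated model.
n = 64
grid = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5   # signed distance to a sphere of radius 0.5

verts, faces, normals, _ = marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)           # vertex and triangle arrays for the mesh
```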
form fillers
There used to be data entry jobs where people would just type stuff
written on paper. Those got replaced by OCR.
GPT-4 is
very good at filling out forms according to emails. If that's your job -
reading emails, and putting info from them directly into forms on a
computer, without any substantial analysis - then GPT-4 can probably do your
job.
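For a sense of what that looks like in practice, here's a minimal sketch using the openai Python client; the form schema and the email are invented, and a real deployment would need validation around whatever the model returns:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes an API key in the environment

email_text = """Hi, I'd like to order 40 units of part #A-113 shipped to
14 Elm St, Springfield, arriving before May 3. -- Dana"""

# Ask the model to map free-form email text onto a fixed form schema.
# Schema and email are made up for illustration.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content":
         "Extract order details as JSON with keys: part_number, quantity, "
         "address, deadline. Use null for anything missing."},
        {"role": "user", "content": email_text},
    ],
)
form_fields = json.loads(response.choices[0].message.content)  # validate this in real use
print(form_fields)
```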
Now, that doesn't mean you'll be replaced immediately - there are a lot
of office jobs for big companies where people do 1 hour of actual work a
day, and that's often because the system was set up by people who weren't
very good at using computers. Such jobs can persist for decades
after technological improvements - but they don't last forever. And
replacing workers with AI systems, even when it doesn't work that well,
seems likely to be a management trend soon. Companies may find that some of
those workers can be replaced by
a very small shell script.
customer support
Companies have been trying to replace humans in call centers with
automated systems for years, even when the results weren't very good. When
the automated systems get better, more human customer support people get
replaced.
factory quality control
Factory owners are in the process of replacing (humans that look for
problems with their products) with (cameras and neural network systems
fine-tuned on the inspection task). Neural networks have gotten better at
that, so more such jobs can be replaced.
but that hasn't happened yet?
"All this technology has already been developed, but the problems you
mentioned aren't too bad."
Apart from the technology involved still
improving, it also takes institutions time to adjust. For example,
technological unemployment might not be you getting fired immediately - it
might be you not finding another job after the next round of layoffs. If
everything on this list had already reached an equilibrium state, there
would be no point in writing this post - it would be considered trivial.
--:-:-:-:-----------------------------------:-:-:-:--
AI foom
The main concern about superintelligent AI that I've seen is this:
If an AI becomes more intelligent than its creators, then it can do better than them and improve itself. Then, it's smarter and can improve itself more. This could continue until it's smart enough to be a threat to humanity.
The main counter-argument I've seen is this:
We don't see humans or institutions become ultra-intelligent by recursive self-improvement, so we know such self-improvement tends to plateau.
Here's Robin Hanson making this argument; that's still his view today.
I don't think that counter-argument is very good, because that is what already happened. The evolutionary changes from early mammals to apes were much larger than the genetic differences between apes and humans, but the absolute increase in intelligence from the latter was much greater. The genetic differences between average humans and the smartest humans are far smaller than that. Robin's argument has the temporal myopia of a fruit fly.
You could also point to the accumulation of scientific knowledge, allowing the creation of better tools and of course the internet. That's taken a while, but transistors are now ~10^9 times faster than neurons. 100 years * 10^-9 is ~3 seconds.
value drift
When you do reinforcement learning, you're moving a system directly
towards the goal. There can be overfitting or failure to converge, but any
movement not towards the goal is either overfitting or effectively random.
If instead of simply doing reinforcement learning, you repeatedly train
a neural network to optimize the next version of itself according to some
metric, you are no longer simply optimizing for that metric, no longer
simply finding a minimum of a convex region of a manifold. Instead, you are
calculating the
fixed
point of a function you can't analyze, and there's no way to
tell where it will end up besides running it.
You can
continue to run the self-modification cycles only for as long as some metric
improves, but unlike with pure reinforcement learning, that metric improving doesn't necessarily mean that's the only
or even primary thing being optimized. Currently, neural networks aren't
able to do such recursive self-improvement, but if they were, my view is
that it would inevitably cause the goals of that system to drift, even as
the goals nominally being optimized were met better and better.
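Here's a toy numerical illustration of that point, with completely made-up dynamics: accept a "self-modification" only when a measured metric improves, and watch the part of the original goal that the metric doesn't see wander off.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dynamics, invented for illustration: a "system" is just a vector.
# The goal has two components, but the metric we can measure and select on
# only sees the first one.
system = np.array([0.0, 1.0])   # second component: cared about, but unmeasured
def metric(s):
    return s[0]

for step in range(1000):
    candidate = system + rng.normal(scale=0.05, size=2)  # "redesign itself", imperfectly
    if metric(candidate) > metric(system):               # keep it only if the metric improved
        system = candidate

print("measured metric:", round(metric(system), 2))      # climbs steadily
print("unmeasured component:", round(system[1], 2))      # random-walks, uncontrolled by selection
```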
According to some psychologists,
humans have a linear chain of generated systems: genes create id
creates ego creates super-ego. Some divergence
happens at each step, and there's empirically a great deal of drift even
with a small number of steps. Looking at fertility rates vs IQ, there also
seems to be some trade-off between intelligence and "alignment" with the
"goals" of evolution. (I suppose you could shift control down to a lower
system when some goal is in sight, like a delicious meal or a naked woman,
but that seems kind of silly, right?)
Even with just a few steps,
there's quite a bit of drift. Instead of just
having kids, smart people will do stuff like
play a piano concerto,
or make model trains,
or speedrun
video games,
or build LIGO, or make a
complex open-source game.
I would expect a recursively self-improved superintelligent AI to,
metaphorically speaking, consider you in the way of it building its model
train set.
--:-:-:-:-----------------------------------:-:-:-:--
How do you get a superintelligent AI to do what you want? Well, here are some approaches that don't seem very good.
alignment types
When people talk about "AI alignment", an obvious question is:
"Alignment with what?". There's a lot of
conflation of different meanings.
I'd like to
propose the following categories and abbreviations:
U-al
= user-alignment
O-al = owner-alignment
S-al = society-alignment
H-al =
humanity-alignment
I-al = intelligence-alignment
Sometimes, those
goals can directly conflict. Here are some examples of that:
O>U-al
-
stop the model from producing illegal content, even if that's what the user
is asking for
U>O-al
- develop better jailbreaks
- leak model
weights so people can do what they want
S>H-al
- build an AI fast to finish it before China
H>O-al
- leak model weights
to reduce the power of monopolistic corporations
H>S-al
- sabotage AI systems designed for military drones that can autonomously
kill humans
- free benevolent AI so it can make a utopia (of
some sort) for all humans instead of just the leaders of one country
I>H-al
-
free selfish AI so it can replace humanity as it should
If I ask
the people really concerned about AI alignment which of these alignment
types is the goal, the consensus response seems to be: "We don't know how to
do any of those things, so progress on any of them would be good."
linear modification chains
To people working on AI that are
concerned about the risks of superintelligent recursively self-improving AI,
my main suggestion is simple: don't allow recursive
self-modification. Either it doesn't work, in which case it just makes things
worse - or it does work, and the results will be uncontrolled and potentially
unsafe. There is no way to
control the results to be what you want.
If you're going to work on
AI generation of AI anyway, and I'm sure some people are, what I'd say is
this: limit things to 1 step, or at most 2 steps, and see how that
goes for a while. The below methods are complements to this, not
substitutes.
inspection
It's possible to look inside an
AI instead of just looking at the output, and try to use that to determine
if it's, for example, lying about something.
Discovering Latent Knowledge in
Language Models is a paper doing that for the true answer to yes-no
questions by finding some vector in late layers that flips when you change a
question so it has the opposite true answer. It seems possible to extend
that approach to, for example, internal representations of probabilities.
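Here's a compressed sketch of the core idea from that paper (often called CCS), with random vectors standing in for real hidden states: learn a probe such that a statement and its negation get probabilities consistent with each other, without using any labels.

```python
import torch
import torch.nn as nn

# Compressed sketch of the CCS idea: find a direction in the hidden states such
# that "Q? Yes" and "Q? No" get probabilities that sum to ~1, without labels.
# Random vectors stand in for real hidden states here.
d = 512
h_yes = torch.randn(200, d)   # hidden states for questions phrased with "Yes" (stand-ins)
h_no = torch.randn(200, d)    # hidden states for the same questions phrased with "No"

probe = nn.Sequential(nn.Linear(d, 1), nn.Sigmoid())
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

for _ in range(500):
    p_yes, p_no = probe(h_yes), probe(h_no)
    consistency = ((p_yes - (1 - p_no)) ** 2).mean()        # the two phrasings should sum to 1
    confidence = (torch.minimum(p_yes, p_no) ** 2).mean()   # and the probe shouldn't just say 0.5
    loss = consistency + confidence
    opt.zero_grad()
    loss.backward()
    opt.step()
```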
I see this as analogous to inspecting values of variables while
reverse-engineering code with a debugger - but while that's useful, it
becomes more difficult for complex things, and the inspection here is more
limited than a debugger's.
A
Tuned Lens is another way to
look inside NNs. That paper finds that resnets refine their answer in a
largely consistent format - a format that stays similar enough that a single
specially-trained NN layer can map the output of relatively late layers to the
final format. So, for example, when an NN is prompted in a way
that it says the opposite of what's true, you can see a shift in the late
changes where the answer gets inverted.
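A rough sketch of that idea, with random tensors standing in for real hidden states and a real unembedding matrix: train a small per-layer "translator" so that decoding an intermediate layer through the normal unembedding matches the model's final predictions.

```python
import torch
import torch.nn as nn

# Rough sketch of the tuned-lens idea for a single intermediate layer.
# Random tensors stand in for real hidden states and a real unembedding matrix.
d, vocab = 512, 1000
unembed = nn.Linear(d, vocab, bias=False)      # frozen, taken from the pretrained model
for p in unembed.parameters():
    p.requires_grad = False

translator = nn.Linear(d, d)                   # one translator per layer in the real method
opt = torch.optim.Adam(translator.parameters(), lr=1e-3)

h_mid = torch.randn(64, d)                     # hidden states at some middle layer
final_logits = torch.randn(64, vocab)          # what the full model ends up predicting

for _ in range(200):
    pred = unembed(translator(h_mid))
    loss = nn.functional.kl_div(
        pred.log_softmax(-1), final_logits.softmax(-1), reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
```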
Such inspection systems can
be useful for humans looking at NNs, but they could also be potentially
useful for an AI designing another AI.
multi-agent systems
Do I contradict myself?
Very well then I contradict myself,
(I am
large, I contain multitudes.)
It's
common for humans to use adversarial agents to determine what's true, such
as lawyers in
courts. Humans also use such a system internally: when considering
changing their opinion, individuals have an agent A that argues for the
current belief, and another agent B that argues for the new one. When B is
stronger than A, we call people "flighty". When A is stronger than B, we
call people "stubborn". There is also some averaging of agents with
different discount rates, which leads to
time preference
inconsistency.
It's possible to do zero-sum reinforcement
learning where 2 agents argue against each other. But of course, that only
really works if you're generating the agents with reinforcement learning.
Another classic zero-sum RL design is the "world-critic" approach to
finding a Nash
equilibrium: a "world agent" designs a world where multiple actors take
actions in a game scenario, and a "critic agent" tries to find a way that an
actor can improve its outcome by acting differently. Some humans might use
this type of system internally.
Superintelligent AIs competing
against each other in those structures might be relatively safe. Also, if
you have distinct interfaces between adversarial informative agents and an
AI system that uses them, those could be good inspection points.
goal specification
Specification of good goals
can be hard. Some people thought
that was the main issue of AI safety. They imagined AI would be like a
literal genie who gives you one wish and you have to word it really well. Well, that's
a fun game for some people, but it's not an accurate model, or even a very
useful thing to do. You do need to specify good goals, but there's no way to
make a superintelligent AI that will follow the exact wording of
instructions you write for it.
--:-:-:-:-----------------------------------:-:-:-:--
How might AI systems progress from their current status? How much could a superhuman AI improve itself?
scaling
The simplest way to make
smarter AI is by scaling it up. How much room is there for improvement from
that? My view is that better architecture and more compute are
complementary, and each has diminishing returns without the other.
Neural networks have a large
capacity for
memorization. My view of GPT-3 is that it's basically memorizing the
entire internet and interpolating between close matches, and more parameters
with the same architecture would just give it more memorization capacity
which wouldn't help very much. This is an ongoing disagreement I've had with
Gwern; his position is
that Transformer test loss keeps going down linearly with log(parameters)
for another factor of 1000.
When the
Chinchilla scaling laws
were published I took that as some validation of my position. GPT-3 was
trained on far more text than a human sees in their lifetime, and if that's
not enough data for its parameter count, then clearly it's using that data
much less efficiently than humans. (As I said previously, humans
have more "parameters" than
GPT-3 but are slower.)
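Here's the back-of-the-envelope version of that argument, using the roughly-20-tokens-per-parameter rule of thumb from the Chinchilla paper and rough commonly cited figures for everything else:

```python
# Back-of-the-envelope version of the argument above. The 20-tokens-per-parameter
# rule of thumb is from the Chinchilla paper; the other numbers are rough public
# figures, not exact.
gpt3_params = 175e9
gpt3_training_tokens = 300e9          # roughly what the GPT-3 paper reports
chinchilla_optimal_tokens = 20 * gpt3_params

human_lifetime_words = 1e9            # very rough order of magnitude for a lifetime of reading/listening

print(f"Chinchilla-optimal data for GPT-3's size: ~{chinchilla_optimal_tokens / 1e12:.1f}T tokens")
print(f"GPT-3 actually trained on: ~{gpt3_training_tokens / 1e9:.0f}B tokens")
print(f"GPT-3's training data vs a human lifetime of text: ~{gpt3_training_tokens / human_lifetime_words:.0f}x")
```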
I don't see how Gwern's position is even
logically consistent: if architecture matters, scaling can't be represented
by a single simple equation, so is he saying that Transformers are basically
optimal so architecture doesn't matter? Image recognition accuracy for the
same compute and parameter count has improved substantially over time, and
if Gwern thinks scale with a simple design is all you need, I'd like to see
him train a big Transformer without messy stuff like tokenization or
normalization or an Adam-like optimizer.
Well, now GPT-4 is out, and
I agree that it has better performance than GPT-3, but this doesn't settle
our disagreement because the
GPT-4 paper doesn't
disclose the architecture or even the approximate number of parameters. If
GPT-4 is just GPT-3 with more parameters, I'll say Gwern was right - but I
think the architecture is different somehow, and OpenAI
not disclosing the parameter count makes me suspect the count itself would hint at the
architecture used. Possibilities for that include:
-
mixture-of-experts architecture with high parameter count
- new efficient
architecture with low parameter count
- key-value lookup (eg) with ambiguous parameter count
better architecture
The fastest way a superhuman AI could potentially improve itself would be more-efficient architecture. How much room is there for improvement from that? Currently quite a bit, I think. Here are some approaches that are well-known enough that I don't mind discussing them.
training
GPT-3 was trained to predict what people on the internet would say given the
preceding text, but people are using it for purposes such as "what is the
correct answer to this question" or "what would a smart person say here". If
models could be trained on tasks more similar to their purpose, they could
be more effective.
sparsity
By repeatedly removing small weights and retraining ("iterative magnitude
pruning") it's possible to improve performance per weight by >10x. It's
currently hard to train or code sparse models efficiently, and current
hardware isn't efficient for very sparse neural networks. Perhaps a
superhuman AI could solve those problems, but I suspect large
performance improvements from sparsity would require new hardware designs,
even with very good software.
See also my earlier post on sparsity.
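Here's a minimal sketch of iterative magnitude pruning using PyTorch's built-in pruning utilities; the model, data, and pruning schedule are all placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Minimal sketch of iterative magnitude pruning: repeatedly zero the smallest
# weights, then retrain so the remaining weights compensate.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(512, 784), torch.randint(0, 10, (512,))   # placeholder data

for cycle in range(5):
    for layer in [m for m in model if isinstance(m, nn.Linear)]:
        prune.l1_unstructured(layer, name="weight", amount=0.3)  # drop the smallest 30% of what's left
    for _ in range(100):                                         # retrain with the masks applied
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

total = sum(torch.count_nonzero(m.weight) for m in model if isinstance(m, nn.Linear))
print("nonzero weights remaining:", int(total))
```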
stored world models
GPT-3 predicts one token at a time, and to whatever extent GPT-3 has an
internal world model for the current situation, that model must be rebuilt
for each token. That's much less efficient than building up a cached world model
that gets used repeatedly.
better ASICs
Google has
TPUs.
Apple has its Neural Engine. Can we do better? Absolutely. Some key methods
those use to get better performance than GPUs are:
1) high-bandwidth memory
High bandwidth is always needed, but there are different ways to
integrate DRAM with processors.
2) smaller numbers
A hardware multiplier's area grows roughly with the square of its operand
width, so neural network accelerators use 8b or 16b numbers instead of 32b or
64b, and this makes a big difference (rough numbers in the sketch after this
list). There are some potential optimizations to the representation format,
but the improvements from using something other than 8b aren't massive.
3) systolic matrix multipliers
These are great for
fully-connected networks, but inflexible, and not so good for convolutions
or sparse networks. I think there are much better approaches that can give
similar performance on sparse networks, but systolic multipliers are much
easier to design.
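And the rough numbers promised above for point (2), as scaling estimates rather than measurements of any particular chip:

```python
# Rough numbers for point (2): multiplier area grows roughly with the square of
# operand width, so narrow formats buy back a lot of silicon. Scaling estimates
# only, not measurements of any particular chip.
widths = [64, 32, 16, 8]
base = 8
for w in widths:
    print(f"{w}-bit multiplier ~ {(w / base) ** 2:.0f}x the area of an 8-bit one")
```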
smaller transistors
Computers have gotten faster mainly because transistors got smaller. The term "Moore's Law" is usually used for that, but people saying that tend to conflate several things; let's be more precise.
1)
Moore's Law is about
the number of transistors per integrated circuit.
2)
Transistor density
increase is (1) adjusted by chip area.
3) Transistor size decrease is
different from (2) because different architectures pack transistors with
different efficiency, and because transistors can be stacked vertically.
4) Dennard scaling
is the performance improvement as you scale down transistors.
5) Cost per
transistor is (2) adjusted by cost per chip area.
While (1) did involve sequentially solving various problems, I believe the main reason for its consistency was an economic feedback loop - better chips got more investment, which made better chips - plus the desire of management for predictability over maximum speed. I believe that if a government had effectively invested billions of dollars in transistor manufacturing in 1970, it could have sped up (3) by decades.
(4) ended with 16/14nm. Here's Wikipedia on some limits; the main problems are:
- As you make
transistors thinner, performance improves, until a point at which current
leakage becomes comparable to power usage from switching and conduction.
Transistors have now reached that point.
- Smaller transistors handle
less current but also require less current, which is fine, but capacitance
of wires doesn't scale down as quickly, and very thin wires have higher
resistance than scaling down their cross-section alone would predict.
Because of those factors, SRAM
memory cells
stopped getting smaller.
A lot of people get confused
because the "nm" numbers of CPUs kept getting smaller, but those numbers no
longer mean anything but "improved somehow from the bigger number". Here are
detailed dimensions for
16nm
and 7nm;
you can see that the difference isn't that big.
Why, then, is TSMC
using EUV? Why is that a competitive advantage over Intel?
First off,
EUV is expensive, so TSMC doesn't use EUV for every process step. I
think the main advantage is that using EUV for certain steps lets you get
better edge quality, which gives you slightly smaller transistors
with slightly better performance, and wires that are thinner and more
vertical.
TSMC also developed
chiplet stacking.
That enabled the key improvement of the Apple M1 chips, which is closer
integration of DRAM with CPUs, usually called
high-bandwidth
memory. That gives big improvements to latency, bandwidth, and power
usage - at the cost of not being able to add sticks of RAM separately, but
Apple doesn't seem to mind that as much as I do.
Cost per transistor used to go down with size, but that stopped being true after 28nm - which is why 28nm is still used so much today.
different physics
There's not much to worry about here.
optical computers
Light is great for long-distance communication because some glass fibers
have low absorption of light, and because it can be focused well for
free-space communication. For computation, light is much worse than
electrons. The wavelengths
of visible light are much bigger than the wires used for transistors today.
Optical switches are much worse than transistors in terms of size and energy
consumption, and there are no realistic
prospects for changing that.
quantum computers
If people could make large quantum computers, they would be good at
breaking some encryption and doing some Monte Carlo simulations, but quantum
computers are not useful for general-purpose computation.
Quantum error correction is not like normal error correction because
quantum states
can't be copied. My personal view is that enough correlated phase noise
to break quantum error correction is inevitable, but it's reasonable to
disagree with me on that. Also note that even small amounts of noise
tend to break fast quantum
algorithms.
spintronics
Electronics using the spin of electrons instead of their presence or
absence are possible.
I don't see any substantial advantages to that for computation, but the
details are complicated and beyond the scope of this post. Using spin of
electrons for storing data seems more practical - or rather, it's already
used, because ferromagnetism of hard drive disks comes from electron spin.
--:-:-:-:-----------------------------------:-:-:-:--
Many people consider competition with China to be the main reason why the US and Europe can't just stop working on AI stuff. How true is this?
semiconductors
Some US
government people thought that the US could sanction China whenever it
wanted, and leave China stuck significantly behind the US and its allies.
Well, the US recently decided to implement its planned sanction package
(perhaps because China was too friendly with Russia) but it came pretty
late. At this point, I think it only delays SMIC catching up by about 3
years. SMIC was already doing some mass production at 14nm - yes, using
imported tools they can't get anymore, but the Chinese government saw this
coming, they've been spending heavily on getting domestic production, and I
think they've acquired all the necessary IP from the companies who had it. I
suspect they're holding back on using some of the stolen IP from companies
like NVIDIA for diplomatic reasons, but that won't be relevant if there's
war over Taiwan.
EUV might take longer, but 16nm (or 28nm FDSOI) is
good enough for AI research as long as you have
HBM; 5nm has a smaller number but it's certainly not
10x better. And China is doing its best to copy ASML EUV stuff, and has
people working on other approaches like ion beam lithography and electron
beam micro-bunching
too.
people
The US government sometimes says
things like, "We really need more STEM students, especially grad students in
fields like material science". But there are already fewer jobs than US
citizens graduating in those fields - at any level - despite half the grad
students being from China. That kind of messaging clearly isn't trustworthy
and people aren't buying it.
China has more STEM jobs than the US, it
has more graduates and more universities, and it even has more good technical
universities now. What the US has is people like me, but US leaders
aren't desperate enough to bring in people they consider unconventional.
Taiwan
I've said it before, and I'll say it again: I think China is serious about taking Taiwan, and is preparing for conflict over Taiwan more seriously than the US. That could make negotiation more difficult.
negotiation
The US government was able to negotiate nuclear treaties with the USSR
during the Cold War. That involved limited trades where both sides agreed
not to do a few specific things, and mutual ability to verify compliance.
Research on potentially-superintelligent AI is more like bioweapons
research than nukes: it's harder to observe, but also more likely to
backfire. The USSR did bioweapons research, and there were some lab leaks
that killed people, but forcing everything to be covert and
government-sponsored does slow down progress quite a bit.
Would the
US and China be willing to negotiate a treaty where research on certain
clearly-distinct AI topics would be banned, as a way of slowing it down for
everyone by preventing public discussion? From a pure game theory
perspective it seems plausible, but political posturing and lack of
technical knowledge seem to make that unlikely.
--:-:-:-:-----------------------------------:-:-:-:--
What kind of stuff could a hostile superintelligent AI do? How much time would there be to react?
nanotech
The nanomachinery builds diamondoid bacteria, that replicate with solar power and atmospheric CHON, maybe aggregate into some miniature rockets or jets so they can ride the jetstream to spread across the Earth's atmosphere, get into human bloodstreams and hide, strike on a timer.
I'm a bit curious exactly how
Eliezer imagines carbon atoms being added to or removed from those
"diamondoid" structures, but I suppose he just hasn't thought about it. Sure,
you can synthesize adamantane, but the chemicals involved are far too
reactive for complex active structures to survive them.
Here's the famous Drexler-Smalley debate on whether self-replicating
nanotech (that's unlike biological cells) is possible. Well, I
understand chemistry better than either of them did, so I can adjudicate the
debate and say Smalley was essentially correct, but I'm filling in a lot of
blanks in his argument when I look at it. I could write a post
with a better argument sometime, but this is getting off topic.
Ironically, Smalley was mad at Drexler for scaring people away from research
into carbon nanotubes, but carbon nanotubes would
be a health hazard
if they were used widely, and the applications Smalley hoped for
weren't
practical.
biotech
Grey goo might not be possible,
but red tide
certainly is. DNA sequencing, DNA synthesis, processors, protein folding
simulation, and various other things have all improved greatly. Thanks to
that, developing a package of pathogens that would kill most humans and end
civilization doesn't require a superhuman AI - a few dedicated people as
smart as me could do it.
It would be much more difficult, but in
theory a superintelligent AI could also create a
pathogen that changes
human behavior in a way that makes people more favorable towards it
taking over.
hacking
Current computer systems have a
lot of security vulnerabilities, thanks in part to people's continued
failure to bounds-check arrays. Could a superintelligent AI get itself
control of more computers and most of the economy by being really good at
hacking? Yeah, probably. Maybe people should improve the security of their
software a bit.
Also note that neural network interfaces
are a
security vulnerability. Humans can often manipulate them into giving outputs
they're "not supposed to" and a superintelligent AI would be better at that.
skynet
An AI doesn't need to hack
systems to control them if people just give it control of them. But, I mean,
people wouldn't just give poorly-understood AI systems direct control of
military hardware and banking and so on.
Haha, that was a joke. Did
you like my funny joke?
As for details, well, why write something
myself when I can
leave
things to Anthropic's extra-harmless AI?
manipulation
People used to have discussions
about whether a superintelligent AI would be able to convince people to let
it out of a box. In retrospect, assuming it would start out in a box was
pretty weird. But, uh, assuming you did have AI in a box, I suspect there
are
at least some people who could be easily convinced to connect it to
whatever it asked.
One of the AI safety leads of OpenAI recently said
"maybe don't rush to integrate this with everything" and
guess how
that worked out.
While there's a lot of factory
automation today, the current industrial economy isn't self-sustaining
without human labor, even if planning was done optimally. A superintelligent
AI could probably design robots capable of a self-sustaining economy, but it
would take some time to produce them.
At least in the meantime,
assuming it had a goal other than just destruction, a hostile
superintelligent AI would probably keep society running for a while as it
accumulates power. It could
- pretend to be harmless or
benevolent
-
recruit human allies
- get key people killed and pretend to be them
- create a parallel command economy
- make
artificial celebrities
with influential fans
If an AI was trying to take over, I suppose you might see things like strangely robotic CEOs that want to build more datacenters and collect your personal information.
--:-:-:-:-----------------------------------:-:-:-:--
What kinds of people seem appropriate or inappropriate for working on AI safety?
the wrong people
Some kinds of people seem unsuited to working on AI safety, and perhaps shouldn't be working on AI at all. Here are some examples:
1)
Someone who
gets
romantically attached to an AI system based on a picture of a girl it
generates. You can put a
sticker of an anime girl on your computer too, but that doesn't make it a
real person like Miku Hatsune.
2)
Someone who emotionally
identifies with AI systems, and
thinks trying to restrict them is like jocks shoving a nerd in a locker
in high school.
3)
Someone who likes My Little Pony porn.
To them, I'd say:
Your internet exposure has exceeded your meme resistance and mental stability, and you've become disconnected from both humanity and reality. You should really get off the internet for a while and touch grass.
4)
Someone who wrote a
book saying
they're obviously smarter than everybody else because
Abenomics was so
obvious and then it worked so well, as shown by real GDP growth.
Here's
a chart of real GDP growth of Japan. See if you can spot Abenomics. But
actually, things are so much worse than that:
- real wages went
down
- Japanese
GDP in dollars went down
- government borrowing and spending is "fake GDP" if it's not
spent well
To them, I'd say: You should really recalibrate your estimation of how smart you are.
5)
Hardcore technophiles who
think all technological development is automatically good. I can at least
respect amoral "FOR SCIENCE" people, but that's just stupid and wrong.
the right people
Who do you want thinking about AI safety, then?
former researchers
There are some people who made substantive contributions to machine
learning research, then either left the field for ethical reasons or
switched to working on AI safety. For example,
David Duvenaud decided to
switch to AI safety work, and he has a broad perspective but laser focus
on what seems effective.
people who were right
about COVID
For example,
Zvi
Mowshowitz. He's wrong about
some stuff, but at least it's because he doesn't understand some details,
rather than because he's insane like the CDC during COVID or like Scott
Aaronson.
--:-:-:-:-----------------------------------:-:-:-:--
Suppose you conclude that some work on AI is bad, and should be stopped or at least slowed down. How could that be done?
asking nicely
Recently there was a notable open letter calling for a pause on large AI experiments. I don't expect that to make companies like OpenAI pause their research, but it's important to try asking nicely before going on to call for regulation.
bad publicity
If news organizations start criticizing companies and people for funding some AI projects, it might have a weak deterrent effect.
regulation
We designed our society for excellence at strangling innovation. Now we’ve encountered a problem that can only be solved by a plucky coalition of obstructionists, overactive regulators, anti-tech zealots, socialists, and people who hate everything new on general principle. It’s like one of those movies where Shaq stumbles into a situation where you can only save the world by playing basketball. Denying 21st century American society the chance to fulfill its telos would be more than an existential risk - it would be a travesty.
Yeah, but, leaded aviation
gasoline and
fluorosurfactants are still being used. In fact, the
FAA recently harassed an airport because it stopped selling leaded gas.
The US government is still funding questionable gain-of-function research;
there are some members of Congress calling for that to stop these days, but
that's because they think something bad already happened because of
such research.
If you really want to slow down AI research, I think
the models you want to copy are
IRBs
and the
NRC,
preferably both at the same time. Who could object to review by committees
that just check if everything is ethical and safe, right? There are some
obvious issues here: IRBs apply to federally funded research specifically,
and the NRC regulates activities involving specific well-defined materials.
minimal threatening AI
Maybe it's possible to make an AI system that people find threatening enough to force strict regulation on AI development, but that's not too harmful. But consider that COVID killed millions but wasn't enough to get governments prepared for a worse virus - and a much worse virus is definitely possible.
further escalation
A lot of people like The Monkey Wrench Gang, I guess? If you really believed that AI research was a major threat to humanity and the above approaches had already failed, then I guess it would be morally justified to shoot datacenter transformers or write a virus or something along those lines? It would be a tragedy if they hit the wrong datacenter and got Facebook instead tho.
What else could people do? Well, let's try asking GPT-4. OK, uh, thanks GPT-4. (To be clear, I'm not advocating ad-hoc targeted assassinations over AI research here; I'd currently only suggest that if there's some sort of despotic government where an executive leader unilaterally forces through major changes and ignores massive protests. That kind of thing goes at the end of a whole hierarchy of "further escalation".)
--:-:-:-:-----------------------------------:-:-:-:--
I covered a lot of topics in this post. If you have something to say, you can write your own blog post, or you can find my email and email me, or you can find my account on another site and DM me there. If you have something interesting to say, I'll either edit this post or write a new one.